Our project explores the relationship between diet and the infection rate of COVID-19. In this presentation, we will go through why we chose this subject, how we developed our design ideas, what tools we used in the process, and finally, our results.
As I’m sure you’ve noticed, the COVID-19 pandemic has drastically changed our everyday lives, and we thought it would be interesting to find out something about this topic. We didn’t want to focus too much on heavy subjects such as deaths and mortality rates; we have enough of that in the news already. Instead, we wanted to share more positive information about possible daily actions people can take to stay healthy during the pandemic.
While going through the data on Kaggle,[1] we came upon Bruno Viera Ribeiro’s finished analysis of it,[2] and while reading through his process, we found a couple of interesting points that we wanted to zero in on ourselves. One was the possible impact of obesity on COVID-19: Bruno investigated its impact on mortality rate, while we focused on infection rate. Bruno had also split the obesity category into two levels, high and low, while we split it into three, high, medium, and low, for a more detailed investigation. Lastly, we wanted to know which types of food in particular may have an impact on COVID-19 infection rates, which we investigated using regression analysis.
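The three-level split can be produced in several ways, and the thresholds we used are not stated here. As a minimal sketch, assuming tercile cut points and a hypothetical `Obesity` column filled with toy values (not the real data), it could look like this:

```r
# Toy obesity rates (%), hypothetical values for illustration only
toy <- data.frame(
  Country = c("A", "B", "C", "D", "E", "F"),
  Obesity = c(4.5, 8.1, 18.3, 22.0, 27.4, 31.9)
)

# Assumption: cut points at the 1/3 and 2/3 quantiles (terciles)
breaks <- quantile(toy$Obesity, probs = c(0, 1/3, 2/3, 1))

# Label each country as low, medium, or high
toy$ObesityCat <- cut(toy$Obesity,
                      breaks = breaks,
                      labels = c("low", "medium", "high"),
                      include.lowest = TRUE)

# Count how many countries fall into each level
print(table(toy$ObesityCat))
```

Terciles give groups of roughly equal size; fixed thresholds would be an equally reasonable choice.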
For our project, we used R, Tableau, and Procreate.
In this figure, we can see that countries at a lower obesity level have a lower infection rate.
Then, we focused on Animal and Vegetal food intake of different countries. We can observe that countries with a higher obesity rate have a higher intake of Animal Products and a lower intake of Vegetal Products in terms of calories.
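A comparison like this can be sketched by summing the animal and vegetal columns per country and averaging within each obesity level. The snippet below is only an illustration in base R: the column names and all values are hypothetical toy data, not the real dataset.

```r
# Toy data: hypothetical shares of calorie intake (%), not the real dataset
toy <- data.frame(
  ObesityCat = c("low", "low", "medium", "medium", "high", "high"),
  Meat       = c(2, 3, 5, 6, 8, 9),
  Eggs       = c(1, 1, 2, 2, 3, 3),
  Cereals    = c(30, 28, 22, 20, 15, 14),
  Vegetables = c(6, 5, 4, 4, 3, 2)
)
animal_cols  <- c("Meat", "Eggs")
vegetal_cols <- c("Cereals", "Vegetables")

# Total animal and vegetal share per country
toy$Animal  <- rowSums(toy[, animal_cols])
toy$Vegetal <- rowSums(toy[, vegetal_cols])

# Mean totals per obesity level
totals <- aggregate(cbind(Animal, Vegetal) ~ ObesityCat, data = toy, FUN = mean)
print(totals)
```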
Now, Winnie will talk more about the relationship between diet and obesity.
library(plotly)
library(tidyverse)

# Load the diet data, which already contains the obesity level (ObesityCat)
DietData <- read.csv("categorized.csv")

# Column names of the animal food groups
animal_features <- c('Animal.fats', 'Aquatic.Products..Other', 'Eggs',
                     'Fish..Seafood', 'Meat', 'Milk...Excluding.Butter',
                     'Offals')

# Column names of the vegetal food groups
vegetal_features <- c('Alcoholic.Beverages', 'Cereals...Excluding.Beer',
                      'Fruits...Excluding.Wine', 'Miscellaneous',
                      'Oilcrops', 'Pulses', 'Spices', 'Starchy.Roots',
                      'Stimulants', 'Sugar...Sweeteners', 'Sugar.Crops',
                      'Treenuts', 'Vegetable.Oils', 'Vegetables')

# Mean intake of each animal food group across all countries (no grouping)
DietData_Mean_A <- DietData %>%
  summarize(across(.cols = all_of(animal_features), .fns = mean)) %>%
  rename(
    `Animal fats` = Animal.fats,
    `Aquatic Products Other` = Aquatic.Products..Other,
    `Eggs` = Eggs,
    `Fish, Seafood` = Fish..Seafood,
    `Meat` = Meat,
    `Milk - Excluding Butter` = Milk...Excluding.Butter,
    `Offals` = Offals
  )

# Reshape the single row of means into one row per food group
DietData_Categorized_A <- data.frame(
  Categories = colnames(DietData_Mean_A),
  Mean = unlist(DietData_Mean_A[1, ], use.names = FALSE)
)

# Pie chart of the mean share of each animal food group
figa <- plot_ly(DietData_Categorized_A,
                labels = ~Categories,
                values = ~Mean,
                type = 'pie') %>%
  layout(title = 'Mean food intake by Animal products groups')

# Mean intake of each vegetal food group across all countries (no grouping)
DietData_Mean_V <- DietData %>%
  summarize(across(.cols = all_of(vegetal_features), .fns = mean)) %>%
  rename(
    `Alcoholic Beverages` = Alcoholic.Beverages,
    `Cereals - Excluding Beer` = Cereals...Excluding.Beer,
    `Fruits - Excluding Wine` = Fruits...Excluding.Wine,
    `Miscellaneous` = Miscellaneous,
    `Oilcrops` = Oilcrops,
    `Pulses` = Pulses,
    `Spices` = Spices,
    `Starchy Roots` = Starchy.Roots,
    `Stimulants` = Stimulants,
    `Sugar Sweeteners` = Sugar...Sweeteners,
    `Sugar Crops` = Sugar.Crops,
    `Treenuts` = Treenuts,
    `Vegetable Oils` = Vegetable.Oils,
    `Vegetables` = Vegetables
  )

# Reshape the single row of means into one row per food group
DietData_Categorized_V <- data.frame(
  Categories = colnames(DietData_Mean_V),
  Mean = unlist(DietData_Mean_V[1, ], use.names = FALSE)
)

# Pie chart of the mean share of each vegetal food group
figb <- plot_ly(DietData_Categorized_V,
                labels = ~Categories,
                values = ~Mean,
                type = 'pie') %>%
  layout(title = 'Mean food intake by Vegetal products groups')

figa
figb

# Mean intake of each animal food group per obesity level
DietData_Mean <- DietData %>%
  group_by(ObesityCat) %>%
  summarise(across(.cols = all_of(animal_features), .fns = mean)) %>%
  rename(
    `Animal fats` = Animal.fats,
    `Aquatic Products Other` = Aquatic.Products..Other,
    `Eggs` = Eggs,
    `Fish, Seafood` = Fish..Seafood,
    `Meat` = Meat,
    `Milk - Excluding Butter` = Milk...Excluding.Butter,
    `Offals` = Offals
  )

# Transpose so that rows are food groups and columns are obesity levels
Names_All <- DietData_Mean$ObesityCat
DietData_Mean_T <- as.data.frame(t(DietData_Mean[, -1]))
colnames(DietData_Mean_T) <- Names_All
DietData_Categorized <- data.frame("Categories" = rownames(DietData_Mean_T),
                                   DietData_Mean_T)

# Grouped bar chart: one bar per obesity level for each food group
fig1 <- plot_ly(DietData_Categorized,
                x = ~Categories,
                y = ~high, type = 'bar',
                name = 'High Obesity')
fig1 <- fig1 %>% add_trace(y = ~medium, name = 'Medium Obesity')
fig1 <- fig1 %>% add_trace(y = ~low, name = 'Low Obesity')
fig1 <- fig1 %>% layout(yaxis = list(title = 'Percentage (%)'),
                        barmode = 'group',  # 'Categories' is not a valid barmode
                        title = 'Mean food intake by Animal products')

# Mean intake of each vegetal food group per obesity level
DietData_Mean <- DietData %>%
  group_by(ObesityCat) %>%
  summarise(across(.cols = all_of(vegetal_features), .fns = mean)) %>%
  rename(
    `Alcoholic Beverages` = Alcoholic.Beverages,
    `Cereals - Excluding Beer` = Cereals...Excluding.Beer,
    `Fruits - Excluding Wine` = Fruits...Excluding.Wine,
    `Miscellaneous` = Miscellaneous,
    `Oilcrops` = Oilcrops,
    `Pulses` = Pulses,
    `Spices` = Spices,
    `Starchy Roots` = Starchy.Roots,
    `Stimulants` = Stimulants,
    `Sugar Sweeteners` = Sugar...Sweeteners,
    `Sugar Crops` = Sugar.Crops,
    `Treenuts` = Treenuts,
    `Vegetable Oils` = Vegetable.Oils,
    `Vegetables` = Vegetables
  )

# Transpose so that rows are food groups and columns are obesity levels
Names_All <- DietData_Mean$ObesityCat
DietData_Mean_T <- as.data.frame(t(DietData_Mean[, -1]))
colnames(DietData_Mean_T) <- Names_All
DietData_Categorized <- data.frame("Categories" = rownames(DietData_Mean_T),
                                   DietData_Mean_T)

# Grouped bar chart: one bar per obesity level for each food group
fig2 <- plot_ly(DietData_Categorized,
                x = ~Categories,
                y = ~high, type = 'bar',
                name = 'High Obesity')
fig2 <- fig2 %>% add_trace(y = ~medium, name = 'Medium Obesity')
fig2 <- fig2 %>% add_trace(y = ~low, name = 'Low Obesity')
fig2 <- fig2 %>% layout(yaxis = list(title = 'Percentage (%)'),
                        barmode = 'group',  # 'Categories' is not a valid barmode
                        title = 'Mean food intake by Vegetal products')

fig1
fig2
We have previously observed a relationship between obesity and infection rate, and in this section we explore the connection between diet and obesity. But we still don’t know if diet has direct implications for infection rates, which we will explore through regression analysis.
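To make the regression step concrete, here is a minimal sketch of a linear model of infection rate on food intake. It is not our actual model: the column names `Confirmed`, `Meat`, and `Vegetables` and every value below are assumptions chosen for illustration.

```r
# Toy data: hypothetical values for illustration only, not the real dataset
toy <- data.frame(
  Confirmed  = c(0.4, 0.9, 1.0, 1.8, 1.9, 2.7, 2.6, 3.4),  # infection rate (%)
  Meat       = c(2, 4, 5, 7, 8, 10, 11, 13),               # share of intake (%)
  Vegetables = c(9, 7, 8, 6, 6, 4, 5, 3)                   # share of intake (%)
)

# Ordinary least squares: infection rate as a linear function of food intake
fit <- lm(Confirmed ~ Meat + Vegetables, data = toy)

# Estimated coefficients with standard errors, t values, and p-values
print(summary(fit)$coefficients)
```

Each coefficient estimates how the infection rate changes when one food group's share rises by one percentage point while the other is held fixed. It describes an association in the data, not a causal effect.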
We were inspired by Sonja Kuijpers’ visualization “A View on Despair” and hoped to make a visually engaging representation of what we found with regression analysis.[3]
Hence, we decided to create a tower of food as the visualization of our data. The tower is based on the pie chart Winnie just showed, and the amount of each kind of food matches its percentage. The bottom of the tower is built from Animal products and the top from Vegetal products. The small red blobs represent the COVID-19 virus; their density goes from high to low as the tower rises. The higher people climb on this tower, meaning the higher the percentage of Vegetal products they consume, the less likely they are to encounter COVID-19.
If we had more time, we would like to be able to control for the possible confounding variables that may have influenced the infection rates through ways other than food, such as lockdown policy and political and social conflicts that arose during the pandemic. However, seeing that we don’t have any more time, we will conclude our presentation.
[1] Ren, Maria. “COVID-19 Healthy Diet Dataset.” Kaggle, 7 Feb. 2021, www.kaggle.com/mariaren/covid19-healthy-diet-dataset.
[2] Brunovr. Healthydietvscovid19. 22 Oct. 2020, www.kaggle.com/brunovr/healthydietvscovid19.
[3] Kuijpers, Sonja. “A View on Despair, a Datavisualization Project by STUDIO TERP.” STUDIO TERP, www.studioterp.nl/a-view-on-despair-a-datavisualization-project-by-studio-terp.